105 results for support vector machines

in Deakin Research Online - Australia


Relevance: 100.00%

Abstract:

Appropriate choice of a kernel is the most important ingredient of kernel-based learning methods such as the support vector machine (SVM). Automatic kernel selection is a key issue given the number of kernels available and the current trial-and-error nature of selecting the best kernel for a given problem. This paper introduces a new method for automatic kernel selection, with empirical results based on classification. The empirical study was conducted across five kernels and 112 different classification problems, using the popular kernel-based statistical learning algorithm SVM. We evaluate the kernels’ performance in terms of accuracy measures. We then focus on answering the question: which kernel is best suited to which type of classification problem? Our meta-learning methodology involves measuring the problem characteristics using classical, distance-based and distribution-based statistical information. We then combine these measures with the empirical results to present a rule-based method to select the most appropriate kernel for a classification problem. The rules are generated by the decision tree algorithm C5.0 and are evaluated with 10-fold cross-validation. All generated rules offer high accuracy ratings.
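
The 10-fold cross-validation mentioned in this abstract splits the data into ten parts and holds each part out once for testing. A minimal sketch of the index splitting in plain Python follows; the function name `k_fold_indices` is illustrative and not taken from the paper, and a practical run would normally shuffle (and often stratify) the samples first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Illustrative helper: samples are taken in their given order,
    with no shuffling or stratification.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size
```

Each sample lands in exactly one test fold, so averaging the per-fold accuracy gives an estimate that uses every sample for testing once.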

Relevance: 100.00%

Abstract:

The propensity of wool knitwear to form entangled fiber balls, known as pills, on the surface is affected by a large number of factors. This study examines, for the first time, the application of the support vector machine (SVM) data mining tool to the pilling propensity prediction of wool knitwear. The results indicate that by using the binary classification method and the radial basis function (RBF) kernel function, the SVM is able to give high pilling propensity prediction accuracy for wool knitwear without data over-fitting. The study also found that the number of records available for each pill rating greatly affects the learning and prediction capability of SVM models.
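
For readers unfamiliar with the radial basis function (RBF) kernel named in this abstract, it scores the similarity of two feature vectors as exp(-gamma * ||x - y||^2). A small sketch in plain Python follows; the value of `gamma` is a tuning parameter, and the default used here is an arbitrary assumption, not a setting reported in the study.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Return exp(-gamma * squared Euclidean distance between x and y)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

The value is 1 for identical vectors and decays toward 0 as they move apart; a larger `gamma` makes the decay faster.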

Relevance: 100.00%

Abstract:

Uncertainty is known to be a concomitant factor of almost all real-world commodities such as oil prices, stock prices, and the sales and demand of products. As a consequence, forecasting problems are becoming more and more challenging and ridden with uncertainty. Such uncertainties are generally quantified by statistical tools such as prediction intervals (PIs). PIs quantify the uncertainty related to forecasts by estimating the ranges of the targeted quantities. PIs generated by traditional neural network based approaches are limited by a high computational burden and impractical assumptions about the distribution of the data. A novel technique for constructing high-quality PIs using support vector machines (SVMs) is proposed in this paper. The proposed technique directly estimates the upper and lower bounds of the PI in a short time and without any assumptions about the data distribution. The SVM parameters are tuned using a particle swarm optimization technique by minimizing a modified PI-based objective function. Electricity price and demand data from the Ontario electricity market are used to validate the performance of the proposed technique. Several case studies for different months indicate the superior performance of the proposed method in terms of high-quality PI generation and shorter computational times.
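
The abstract does not spell out its objective function, but the quality of a set of prediction intervals is commonly judged by two quantities: how often the true value falls inside the interval (coverage) and how wide the intervals are on average. The sketch below computes both in plain Python; the function name and the choice of these two measures are assumptions for illustration, not the paper's exact formulation.

```python
def pi_coverage_and_width(targets, lower, upper):
    """Return (coverage, mean_width) for intervals [lower[i], upper[i]].

    coverage   : fraction of targets that fall inside their interval
    mean_width : average of upper[i] - lower[i]
    """
    if not targets or not (len(targets) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    covered = sum(1 for t, lo, hi in zip(targets, lower, upper) if lo <= t <= hi)
    coverage = covered / len(targets)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(targets)
    return coverage, mean_width
```

A useful interval is both reliable and narrow: coverage should reach the nominal confidence level while the mean width stays as small as possible, which is why an objective typically has to trade the two off.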

Relevance: 100.00%

Abstract:

Uncertainty in electricity prices makes accurate forecasting quite difficult for electricity market participants. Prediction intervals (PIs) are statistical tools which quantify the uncertainty related to forecasts by estimating the ranges of the future electricity prices. Traditional approaches based on neural networks (NNs) generate PIs at the cost of a high computational burden and doubtful assumptions about data distributions. In this work, we propose a novel technique that is not plagued by the above limitations and generates high-quality PIs in a short time. The proposed method directly generates the lower and upper bounds of the future electricity prices using support vector machines (SVMs). Optimal model parameters are obtained by the minimization of a modified PI-based objective function using a particle swarm optimization (PSO) technique. The efficiency of the proposed method is illustrated using data from Ontario, Pennsylvania-New Jersey-Maryland (PJM) interconnection day-ahead and real-time markets.

Relevance: 100.00%

Abstract:

The support vector machine (SVM) is a popular method for classification, well known for finding the maximum-margin hyperplane. Combining SVM with an l1-norm penalty further enables it to simultaneously perform feature selection and margin maximization within a single framework. However, l1-norm SVM shows instability in selecting features in the presence of correlated features. We propose a new method to increase the stability of l1-norm SVM by encouraging similarities between feature weights based on feature correlations, which are captured via a feature covariance matrix. Our proposed method can capture both positive and negative correlations between features. We formulate the model as a convex optimization problem and propose a solution based on alternating minimization. Using both synthetic and real-world datasets, we show that our model achieves better stability and classification accuracy compared to several state-of-the-art regularized classification methods.
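
For orientation, the standard l1-norm SVM that this abstract builds on is usually written as a hinge loss plus an l1 penalty on the weights. The formulation below is that textbook baseline, with training pairs $(x_i, y_i)$, $y_i \in \{-1, +1\}$, and a regularization weight $\lambda$ as assumed notation; the correlation-based term the authors add is not specified in the abstract and is therefore not shown.

```latex
\min_{w,\, b} \;\; \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i \,(w^{\top} x_i + b)\bigr) \;+\; \lambda \,\lVert w \rVert_1 ,
\qquad \lVert w \rVert_1 = \sum_{j=1}^{d} \lvert w_j \rvert
```

The l1 penalty drives some weights $w_j$ exactly to zero, which is what performs the feature selection; when two features are highly correlated, which of the two keeps a nonzero weight can change between training samples, and this is the instability the abstract refers to.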

Relevance: 100.00%

Abstract:

Accurate forecasting of wind power generation is an important and challenging task for system operators and market participants due to its high uncertainty. It is essential to quantify uncertainties associated with wind power generation forecasts for their efficient application in optimal management of wind farms and integration into power systems. Prediction intervals (PIs) are well-known statistical tools which are used to quantify the uncertainty related to forecasts by estimating the ranges of the future target variables. This paper investigates the application of a novel support vector machine based methodology to directly estimate the lower and upper bounds of the PIs without an expensive computational burden or inaccurate assumptions about the distribution of the data. The efficiency of the method for uncertainty quantification is examined using monthly data from a wind farm in Australia. PIs for short-term application are generated with a confidence level of 90%. Experimental results confirm the ability of the method to construct reliable PIs without resorting to complex computational methods.

Relevance: 100.00%

Abstract:

Although the hyper-plane based One-Class Support Vector Machine (OCSVM) and the hyper-spherical based Support Vector Data Description (SVDD) algorithms have been shown to be very effective in detecting outliers, their performance on noisy and unlabeled training data has not been widely studied. Moreover, only a few heuristic approaches have been proposed to set the different parameters of these methods in an unsupervised manner. In this paper, we propose two unsupervised methods for estimating the optimal parameter settings to train OCSVM and SVDD models, based on analysing the structure of the data. We show that our heuristic is substantially faster than existing parameter estimation approaches while its accuracy is comparable with supervised parameter learning methods, such as grid search with cross-validation on labeled data. In addition, our proposed approaches can be used to prepare a labeled data set for an OCSVM or an SVDD from unlabeled data.

Relevance: 100.00%

Abstract:

Spam is commonly defined as unsolicited email messages, and the goal of spam filtering is to differentiate spam from legitimate email. Much work has been done to filter spam from legitimate emails using machine learning algorithms, and substantial performance has been achieved with some false positive (FP) tradeoffs. In this paper, a spam filtering architecture based on the support vector machine (SVM) is proposed, which achieves better accuracy by reducing FP problems. The architecture includes an innovative feature selection technique called dynamic feature selection (DFS), which enhances the overall performance of the architecture while reducing FP problems. The experimental results show that the proposed technique gives better performance compared to similar existing techniques.

Relevance: 100.00%

Abstract:

Data pre-processing always plays a key role in learning algorithm performance. In this research we consider data pre-processing by normalization for support vector machines (SVMs). We examine the effect of normalization across 112 classification problems with SVM using the RBF kernel. We observe a significant classification improvement due to normalization. Finally, we suggest a rule-based method to find when normalization is necessary for a specific classification problem. The best normalization method is also automatically selected by SVM itself.
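
The abstract does not name the normalization methods it compares. As an illustration only, the sketch below shows two widely used ones, min-max scaling and z-score standardization, applied to a single feature column in plain Python; both function names are hypothetical.

```python
import statistics


def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so they span [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def z_score(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]
```

Scaling matters for distance-based kernels in particular: without it, a feature measured in large units dominates the squared distance and the remaining features contribute almost nothing to the kernel value.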

Relevance: 100.00%

Abstract:

Plasminogen (Pg), the precursor of the proteolytic and fibrinolytic enzyme of blood, is converted to the active enzyme plasmin (Pm) by different plasminogen activators (tissue plasminogen activators and urokinase), including the bacterial activators streptokinase and staphylokinase, which activate Pg to Pm and thus are used clinically for thrombolysis. The identification of Pg-activators is therefore an important step in understanding their functional mechanism and deriving new therapies.

Relevance: 100.00%

Abstract:

Electronic medical records (EMR) offer promise for novel analytics. However, manual feature engineering from EMR is labor intensive because EMR is complex: it contains temporal, mixed-type and multimodal data packed in irregular episodes. We present a computational framework to harness EMR with minimal human supervision via a restricted Boltzmann machine (RBM). The framework derives a new representation of medical objects by embedding them in a low-dimensional vector space. This new representation facilitates algebraic and statistical manipulations such as projection onto a 2D plane (thereby offering intuitive visualization), object grouping (hence enabling automated phenotyping), and risk stratification. To enhance model interpretability, we introduced two constraints into model parameters: (a) nonnegative coefficients, and (b) structural smoothness. These result in a novel model called eNRBM (EMR-driven nonnegative RBM). We demonstrate the capability of the eNRBM on a cohort of 7578 mental health patients under suicide risk assessment. The derived representation not only shows clinically meaningful feature grouping but also facilitates short-term risk stratification. The F-scores, 0.21 for moderate-risk and 0.36 for high-risk, are significantly higher than those obtained by clinicians and competitive with the results obtained by support vector machines.
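
The F-scores quoted in this abstract combine precision and recall into a single number. The abstract does not state which variant is used; the sketch below assumes the common balanced F1 by default (`beta=1.0`) and works from raw counts of true positives, false positives and false negatives. The function name is illustrative.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from confusion counts; beta=1 gives the balanced F1."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

Because the score ignores true negatives, it stays low for rare classes even when overall accuracy is high, which helps explain why values well below 0.5 can still be competitive in a risk-stratification setting.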

Relevance: 100.00%

Abstract:

Mobile spam has become a major menace to the development of wireless networks. In this paper, the mobile spam problem and its countermeasures are analysed. In particular, we propose using a support vector machine to filter mobile spam. This mobile spam filtering system can be deployed in current wireless networks and achieves good performance in protecting end users and operators from mobile spam. Legislation issues and challenges in defending against mobile spam are also discussed in the latter part of this paper.

Relevance: 100.00%

Abstract:

Spam is commonly defined as unsolicited email messages, and the goal of spam categorization is to distinguish between spam and legitimate email messages. Many researchers have been trying to separate spam from legitimate emails using machine learning algorithms based on statistical learning methods. In this paper, an innovative and intelligent spam filtering model based on the support vector machine (SVM) is proposed. This model combines both linear and nonlinear SVM techniques, where linear SVM performs better for text-based spam classification that shares similar characteristics. The proposed model considers both text- and image-based email messages for classification by selecting an appropriate kernel function for information transformation.